Fast MLE Computation for the Dirichlet Multinomial
Given a collection of categorical data, we want to find the parameters of a
Dirichlet distribution that maximize the likelihood of that data. Newton's
method is typically used for this purpose but current implementations require
reading through the entire dataset on each iteration. In this paper, we propose
a modification which requires only a single pass through the dataset and
substantially decreases running time. Furthermore, we analyze the performance
of the proposed algorithm both theoretically and empirically, and provide an
open-source implementation.
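The single-pass idea can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes each observation is a vector of per-category counts, and it drops the multinomial coefficient, which does not depend on the parameters. The names `build_histograms`, `log_likelihood`, and `direct_log_likelihood` are hypothetical. One pass over the data builds count histograms; after that, the log-likelihood for any candidate parameter vector can be evaluated from the histograms alone, which is what an iterative optimizer such as Newton's method would need on each step.

```python
import math
from collections import Counter


def build_histograms(rows, num_categories):
    """Single pass over the data (hypothetical helper).

    u[k][j] = number of rows whose count for category k exceeds j
    v[j]    = number of rows whose total count exceeds j
    """
    u = [Counter() for _ in range(num_categories)]
    v = Counter()
    for row in rows:
        for k, n in enumerate(row):
            for j in range(n):
                u[k][j] += 1
        for j in range(sum(row)):
            v[j] += 1
    return u, v


def log_likelihood(alpha, u, v):
    """Dirichlet-multinomial log-likelihood (up to a constant in alpha),
    computed from the histograms without touching the raw data."""
    alpha0 = sum(alpha)
    ll = 0.0
    for k, hist in enumerate(u):
        for j, count in hist.items():
            ll += count * math.log(alpha[k] + j)
    for j, count in v.items():
        ll -= count * math.log(alpha0 + j)
    return ll


def direct_log_likelihood(alpha, rows):
    """Reference version that rereads every row, for comparison only."""
    alpha0 = sum(alpha)
    ll = 0.0
    for row in rows:
        ll += math.lgamma(alpha0) - math.lgamma(alpha0 + sum(row))
        for a, n in zip(alpha, row):
            ll += math.lgamma(a + n) - math.lgamma(a)
    return ll


rows = [[2, 0, 1], [0, 3, 1], [1, 1, 1]]  # toy data, made up for illustration
u, v = build_histograms(rows, 3)
alpha = [0.5, 1.5, 2.0]
print(log_likelihood(alpha, u, v), direct_log_likelihood(alpha, rows))
```

The two functions agree because log Γ(a + n) − log Γ(a) equals the sum of log(a + j) for j from 0 to n − 1, so the per-row terms can be regrouped by j once and reused for every candidate `alpha`.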
Relative Probability on Finite Outcome Spaces: A Systematic Examination of its Axiomatization, Properties, and Applications
This work proposes a view of probability as a relative measure rather than an
absolute one. To demonstrate this concept, we focus on finite outcome spaces
and develop three fundamental axioms that establish requirements for relative
probability functions. We then provide a library of examples of these functions
and a system for composing them. Additionally, we discuss a relative version of
Bayesian inference and its digital implementation. Finally, we prove the
topological closure of the relative probability space, highlighting its ability
to preserve information under limits.
Comment: 30 pages, 11 figures
The Dualism of Contemporary Traditional Governance and the State
In many parts of the world, people live in “dual polities”: they are governed by the state and organize collective decision making within their ethnic community according to traditional rules. We examine the substantial body of works on the traditional–state dualism, focusing on the internal organization of traditional polities, their interaction with the state, and the political consequences of the dualism. We find the descriptions of the internal organization of traditional polities scattered and lacking comparative perspective. The literature on the interaction provides a good starting point for theorizing the strategic role of traditional leaders as intermediaries, but large potentials for inference remain underexploited. Studies on the consequences of “dual polities” for democracy, conflict, and development are promising in their explanatory endeavor, but they do not yet allow for robust conclusions. We therefore propose an institutionalist research agenda addressing the need for theory and for systematic data collection and explanatory approaches.
Comparative genetic architectures of schizophrenia in East Asian and European populations
Schizophrenia is a debilitating psychiatric disorder with approximately 1% lifetime risk globally. Large-scale schizophrenia genetic studies have reported primarily on European ancestry samples, potentially missing important biological insights. Here, we report the largest study to date of East Asian participants (22,778 schizophrenia cases and 35,362 controls), identifying 21 genome-wide-significant associations in 19 genetic loci. Common genetic variants that confer risk for schizophrenia have highly similar effects between East Asian and European ancestries (genetic correlation = 0.98 ± 0.03), indicating that the genetic basis of schizophrenia and its biology are broadly shared across populations. A fixed-effect meta-analysis including individuals from East Asian and European ancestries identified 208 significant associations in 176 genetic loci (53 novel). Trans-ancestry fine-mapping reduced the sets of candidate causal variants in 44 loci. Polygenic risk scores had reduced performance when transferred across ancestries, highlighting the importance of including sufficient samples of major ancestral groups to ensure their generalizability across populations.
Detecting Trending Venues Using Foursquare's Data
Foursquare is a search and discovery tool which helps users discover venues around the world. Much of the data for these recommendations come from its sister app Swarm, which is a location-based social network where users can "check in" to places they visit. Older versions of Foursquare had a strongly static component to their recommendations. For instance, the top restaurants in New York City do not vary from month to month, and venues with years of consistently strong signals will dominate search results. In this paper we outline a new algorithm which Foursquare uses in order to discover fresh recommendations. Promoting younger venues with fewer check-ins or older venues with a recent surge of activity increases turnover in our recommendations and yields a better user experience.
Predicting the temporal activity patterns of new venues
Estimating revenue and business demand of a newly opened venue is paramount
as these early stages often involve critical decisions such as first rounds of staffing
and resource allocation. Traditionally, this estimation has been performed through
coarse-grained measures such as observing numbers in local venues or venues at
similar places (e.g., coffee shops around another station in the same city). The
advent of crowdsourced data from devices and services carried by individuals on a
daily basis has opened up the possibility of performing better predictions of
temporal visitation patterns for locations and venues. In this paper, using mobility
data from Foursquare, a location-centric platform, we treat venue categories as
proxies for urban activities and analyze how they become popular over time. The
main contribution of this work is a prediction framework able to use characteristic
temporal signatures of places together with k-nearest neighbor metrics capturing
similarities among urban regions, to forecast weekly popularity dynamics of a new
venue establishment in a city neighborhood. We further show how we are able to
forecast the popularity of the new venue after one month following its opening by
using locality and temporal similarity as features. For the evaluation of our
approach we focus on London. We show that temporally similar areas of the city
can be successfully used as inputs of predictions of the visit patterns of new
venues, with an improvement of 41% compared to a random selection of wards as
a training set for the prediction task. We apply these concepts of temporally
similar areas and locality to the real-time predictions related to new venues and
show that these features can effectively be used to predict the future trends of a
venue. Our findings have the potential to impact the design of location-based
technologies and decisions made by new business owners.